Conversation


@aganders3 aganders3 commented Aug 13, 2025

Summary

This adds optional chunk caching to the get() function to avoid repeated decompression of chunks when accessing overlapping selections or making multiple calls to the same data. I've noticed this can be a significant bottleneck when indexing data slice-by-slice, for example, where the chunks span many slices.

This may be a niche need, so no worries if this is out of scope for this library!

New API

  • cache?: ChunkCache in GetOptions
  • ChunkCache interface with get(key: string): Chunk<DataType> | undefined and set(key: string, value: Chunk<DataType>): any (shaped so a plain Map satisfies it)
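A minimal sketch of what the interface above implies, with a small bounded LRU built on Map's insertion order. The Chunk type is stubbed here for illustration (in zarrita it is parameterized by DataType), and SimpleLRU is a hypothetical helper, not part of this PR:

```typescript
// Stub of zarrita's Chunk type, for illustration only.
type Chunk<D> = { data: ArrayBufferView; shape: number[] };

// The cache contract from the PR description: any object with
// Map-compatible get/set works, including a plain Map.
interface ChunkCache<D = unknown> {
	get(key: string): Chunk<D> | undefined;
	set(key: string, value: Chunk<D>): any;
}

// Hypothetical bounded LRU satisfying ChunkCache, using the fact that
// Map iterates keys in insertion order.
class SimpleLRU<D> implements ChunkCache<D> {
	#map = new Map<string, Chunk<D>>();
	constructor(private max: number) {}
	get(key: string): Chunk<D> | undefined {
		const value = this.#map.get(key);
		if (value !== undefined) {
			// refresh recency by re-inserting at the end
			this.#map.delete(key);
			this.#map.set(key, value);
		}
		return value;
	}
	set(key: string, value: Chunk<D>): void {
		this.#map.delete(key);
		this.#map.set(key, value);
		if (this.#map.size > this.max) {
			// evict the least-recently-used (first) entry
			const oldest = this.#map.keys().next().value;
			if (oldest !== undefined) this.#map.delete(oldest);
		}
	}
}
```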

Usage

  // Use built-in Map as cache
  const cache = new Map();
  const result = await get(array, selection, { cache });

  // Custom LRU cache
  const cache = new LRUCache({ max: 100 });
  const result = await get(array, selection, { cache });

  // No cache (default - unchanged behavior)
  const result = await get(array, selection);

Implementation

  • Cache keys use store_N:${array.path}:${chunkKey} format
  • Store isolation: a WeakMap assigns unique IDs to store instances to prevent cache collisions when a cache is shared across stores (though sharing is perhaps not recommended)
  • Single cache can hold chunks from arrays with different data types
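The store-isolation scheme above can be sketched as follows. The WeakMap tags each store object with a stable numeric ID, so two stores that happen to contain arrays at the same path never collide in a shared cache; the names here (storeIds, storeId, cacheKey) are illustrative, not the PR's actual identifiers:

```typescript
// WeakMap keyed on store identity: IDs persist for the life of the store
// object and do not prevent it from being garbage-collected.
const storeIds = new WeakMap<object, number>();
let nextStoreId = 0;

// Return a stable per-store ID, assigning one on first use.
function storeId(store: object): number {
	let id = storeIds.get(store);
	if (id === undefined) {
		id = nextStoreId++;
		storeIds.set(store, id);
	}
	return id;
}

// Build a key in the store_N:${array.path}:${chunkKey} format.
function cacheKey(store: object, arrayPath: string, chunkKey: string): string {
	return `store_${storeId(store)}:${arrayPath}:${chunkKey}`;
}
```

Because the key embeds the array path and chunk coordinates, a single cache can safely hold chunks from multiple arrays, even with different data types.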


changeset-bot bot commented Aug 13, 2025

🦋 Changeset detected

Latest commit: 7fe18d9

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages:

  Name              Type
  zarrita           Minor
  @zarrita/ndarray  Patch

